[clang][modules] Stop uniquing implicit modules via FileEntry#185765
[clang][modules] Stop uniquing implicit modules via FileEntry#185765jansvoboda11 wants to merge 11 commits intollvm:mainfrom
FileEntry#185765Conversation
e71185b to
e558b32
Compare
|
@llvm/pr-subscribers-clangd @llvm/pr-subscribers-clang-tools-extra Author: Jan Svoboda (jansvoboda11) ChangesThis PR changes how Previously,
This PR identifies implicitly-built module files by a pair of Some tests needed to be updated because the module cache path directory was also used as an include directory. This PR relies on not caching the non-existence of the module cache directory in the Patch is 84.81 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/185765.diff 34 Files Affected:
diff --git a/clang-tools-extra/clangd/ModulesBuilder.cpp b/clang-tools-extra/clangd/ModulesBuilder.cpp
index 524ec620c4076..b5a0127d48b7e 100644
--- a/clang-tools-extra/clangd/ModulesBuilder.cpp
+++ b/clang-tools-extra/clangd/ModulesBuilder.cpp
@@ -275,8 +275,8 @@ bool IsModuleFileUpToDate(PathRef ModuleFilePath,
// listener.
Reader.setListener(nullptr);
- if (Reader.ReadAST(ModuleFilePath, serialization::MK_MainFile,
- SourceLocation(),
+ if (Reader.ReadAST(ModuleFileName::make_explicit(ModuleFilePath),
+ serialization::MK_MainFile, SourceLocation(),
ASTReader::ARR_None) != ASTReader::Success)
return false;
diff --git a/clang/include/clang/Basic/Module.h b/clang/include/clang/Basic/Module.h
index 69a1de6f79b35..c512249e5e945 100644
--- a/clang/include/clang/Basic/Module.h
+++ b/clang/include/clang/Basic/Module.h
@@ -54,6 +54,101 @@ class TargetInfo;
/// Describes the name of a module.
using ModuleId = SmallVector<std::pair<std::string, SourceLocation>, 2>;
+namespace serialization {
+class ModuleManager;
+} // namespace serialization
+
+/// Deduplication key for a loaded module file in \c ModuleManager.
+///
+/// For implicitly-built modules, this is the \c DirectoryEntry of the module
+/// cache and the module file name with the (optional) context hash.
+/// This enables using \c FileManager's inode-based canonicalization of the
+/// user-provided module cache path without hitting issues on file systems that
+/// recycle inodes for recompiled module files.
+///
+/// For explicitly-built modules, this is \c FileEntry.
+/// This uses \c FileManager's inode-based canonicalization of the user-provided
+/// module file path. Because input explicitly-built modules do not change
+/// during the lifetime of the compiler, inode recycling is not of concern here.
+class ModuleFileKey {
+ /// The FileManager entity used for deduplication.
+ const void *Ptr;
+ /// The path relative to the module cache path for implicit module file, empty
+ /// for other kinds of module files.
+ std::string PathSuffix;
+
+ friend class serialization::ModuleManager;
+ friend class ModuleFileName;
+ friend llvm::DenseMapInfo<ModuleFileKey>;
+
+ ModuleFileKey(const void *Ptr) : Ptr(Ptr) {}
+
+ ModuleFileKey(const FileEntry *ModuleFile) : Ptr(ModuleFile) {}
+
+ ModuleFileKey(const DirectoryEntry *ModuleCacheDir, StringRef PathSuffix)
+ : Ptr(ModuleCacheDir), PathSuffix(PathSuffix) {}
+
+public:
+ bool operator==(const ModuleFileKey &Other) const {
+ return Ptr == Other.Ptr && PathSuffix == Other.PathSuffix;
+ }
+
+ bool operator!=(const ModuleFileKey &Other) const {
+ return !operator==(Other);
+ }
+};
+
+/// Identifies a module file to be loaded.
+///
+/// For implicitly-built module files, the path is split into the module cache
+/// path and the module file name with the (optional) context hash. For all
+/// other types of module files, this is just the file system path.
+class ModuleFileName {
+ std::string Path;
+ std::optional<uint32_t> Separator;
+
+public:
+ /// Creates an empty module file name.
+ ModuleFileName() = default;
+
+ /// Creates a file name for an explicit module.
+ static ModuleFileName make_explicit(std::string Name) {
+ ModuleFileName File;
+ File.Path = std::move(Name);
+ return File;
+ }
+
+ /// Creates a file name for an explicit module.
+ static ModuleFileName make_explicit(StringRef Name) {
+ return make_explicit(Name.str());
+ }
+
+ /// Creates a file name for an implicit module.
+ static ModuleFileName make_implicit(std::string Name, uint32_t Separator) {
+ ModuleFileName File;
+ File.Path = std::move(Name);
+ File.Separator = Separator;
+ return File;
+ }
+
+ /// Creates a file name for an implicit module.
+ static ModuleFileName make_implicit(StringRef Name, uint32_t Separator) {
+ return make_implicit(Name.str(), Separator);
+ }
+
+ /// Returns the plain module file name.
+ StringRef str() const { return Path; }
+
+ /// Converts to StringRef representing the plain module file name.
+ operator StringRef() const { return Path; }
+
+ /// Checks whether the module file name is empty.
+ bool empty() const { return Path.empty(); }
+
+ /// Creates the deduplication key for use in \c ModuleManager.
+ std::optional<ModuleFileKey> makeKey(FileManager &FileMgr) const;
+};
+
/// The signature of a module, which is a hash of the AST content.
struct ASTFileSignature : std::array<uint8_t, 20> {
using BaseT = std::array<uint8_t, 20>;
@@ -926,4 +1021,23 @@ class VisibleModuleSet {
} // namespace clang
+template <> struct llvm::DenseMapInfo<clang::ModuleFileKey> {
+ static clang::ModuleFileKey getEmptyKey() {
+ return DenseMapInfo<const void *>::getEmptyKey();
+ }
+
+ static clang::ModuleFileKey getTombstoneKey() {
+ return DenseMapInfo<const void *>::getTombstoneKey();
+ }
+
+ static unsigned getHashValue(const clang::ModuleFileKey &Val) {
+ return hash_combine(Val.Ptr, Val.PathSuffix);
+ }
+
+ static bool isEqual(const clang::ModuleFileKey &LHS,
+ const clang::ModuleFileKey &RHS) {
+ return LHS == RHS;
+ }
+};
+
#endif // LLVM_CLANG_BASIC_MODULE_H
diff --git a/clang/include/clang/Frontend/CompilerInstance.h b/clang/include/clang/Frontend/CompilerInstance.h
index 266e0826b38f4..0d684d5c7f9fe 100644
--- a/clang/include/clang/Frontend/CompilerInstance.h
+++ b/clang/include/clang/Frontend/CompilerInstance.h
@@ -17,6 +17,7 @@
#include "clang/Frontend/PCHContainerOperations.h"
#include "clang/Frontend/Utils.h"
#include "clang/Lex/DependencyDirectivesScanner.h"
+#include "clang/Lex/HeaderSearch.h"
#include "clang/Lex/HeaderSearchOptions.h"
#include "clang/Lex/ModuleLoader.h"
#include "llvm/ADT/ArrayRef.h"
@@ -746,11 +747,6 @@ class CompilerInstance : public ModuleLoader {
GetDependencyDirectives = std::move(Getter);
}
- std::string getSpecificModuleCachePath(StringRef ContextHash);
- std::string getSpecificModuleCachePath() {
- return getSpecificModuleCachePath(getInvocation().computeContextHash());
- }
-
/// Create the AST context.
void createASTContext();
@@ -872,7 +868,7 @@ class CompilerInstance : public ModuleLoader {
void createASTReader();
- bool loadModuleFile(StringRef FileName,
+ bool loadModuleFile(ModuleFileName FileName,
serialization::ModuleFile *&LoadedModuleFile);
/// Configuration object for making the result of \c cloneForModuleCompile()
diff --git a/clang/include/clang/Lex/HeaderSearch.h b/clang/include/clang/Lex/HeaderSearch.h
index d7000da682c6e..fd52a0ad5b007 100644
--- a/clang/include/clang/Lex/HeaderSearch.h
+++ b/clang/include/clang/Lex/HeaderSearch.h
@@ -282,6 +282,10 @@ class HeaderSearch {
/// The specific module cache path containing ContextHash (unless suppressed).
std::string SpecificModuleCachePath;
+ /// The length of the normalized module cache path at the start of \c
+ /// SpecificModuleCachePath.
+ size_t NormalizedModuleCachePathLen = 0;
+
/// All of the preprocessor-specific data about files that are
/// included, indexed by the FileEntry's UID.
mutable std::vector<HeaderFileInfo> FileInfo;
@@ -467,20 +471,20 @@ class HeaderSearch {
return {};
}
- /// Set the context hash to use for module cache paths.
- void setContextHash(StringRef Hash) { ContextHash = std::string(Hash); }
+ /// Initialize the module cache path.
+ void initializeModuleCachePath(std::string ContextHash);
- /// Set the module cache path with the context hash (unless suppressed).
- void setSpecificModuleCachePath(StringRef Path) {
- SpecificModuleCachePath = std::string(Path);
+ /// Retrieve the module cache path with the context hash (unless suppressed).
+ StringRef getSpecificModuleCachePath() const {
+ return SpecificModuleCachePath;
}
/// Retrieve the context hash.
StringRef getContextHash() const { return ContextHash; }
- /// Retrieve the module cache path with the context hash (unless suppressed).
- StringRef getSpecificModuleCachePath() const {
- return SpecificModuleCachePath;
+ /// Retrieve the normalized module cache path.
+ StringRef getNormalizedModuleCachePath() const {
+ return getSpecificModuleCachePath().substr(0, NormalizedModuleCachePathLen);
}
/// Forget everything we know about headers so far.
@@ -657,7 +661,7 @@ class HeaderSearch {
///
/// \returns The name of the module file that corresponds to this module,
/// or an empty string if this module does not correspond to any module file.
- std::string getCachedModuleFileName(Module *Module);
+ ModuleFileName getCachedModuleFileName(Module *Module);
/// Retrieve the name of the prebuilt module file that should be used
/// to load a module with the given name.
@@ -669,8 +673,8 @@ class HeaderSearch {
///
/// \returns The name of the module file that corresponds to this module,
/// or an empty string if this module does not correspond to any module file.
- std::string getPrebuiltModuleFileName(StringRef ModuleName,
- bool FileMapOnly = false);
+ ModuleFileName getPrebuiltModuleFileName(StringRef ModuleName,
+ bool FileMapOnly = false);
/// Retrieve the name of the prebuilt module file that should be used
/// to load the given module.
@@ -679,7 +683,7 @@ class HeaderSearch {
///
/// \returns The name of the module file that corresponds to this module,
/// or an empty string if this module does not correspond to any module file.
- std::string getPrebuiltImplicitModuleFileName(Module *Module);
+ ModuleFileName getPrebuiltImplicitModuleFileName(Module *Module);
/// Retrieve the name of the (to-be-)cached module file that should
/// be used to load a module with the given name.
@@ -691,8 +695,8 @@ class HeaderSearch {
///
/// \returns The name of the module file that corresponds to this module,
/// or an empty string if this module does not correspond to any module file.
- std::string getCachedModuleFileName(StringRef ModuleName,
- StringRef ModuleMapPath);
+ ModuleFileName getCachedModuleFileName(StringRef ModuleName,
+ StringRef ModuleMapPath);
/// Lookup a module Search for a module with the given name.
///
@@ -808,13 +812,13 @@ class HeaderSearch {
/// \param ModuleMapPath A path that when combined with \c ModuleName
/// uniquely identifies this module. See Module::ModuleMap.
///
- /// \param CachePath A path to the module cache.
+ /// \param NormalizedCachePath The normalized path to the module cache.
///
/// \returns The name of the module file that corresponds to this module,
/// or an empty string if this module does not correspond to any module file.
- std::string getCachedModuleFileNameImpl(StringRef ModuleName,
- StringRef ModuleMapPath,
- StringRef CachePath);
+ ModuleFileName getCachedModuleFileNameImpl(StringRef ModuleName,
+ StringRef ModuleMapPath,
+ StringRef NormalizedCachePath);
/// Retrieve a module with the given name, which may be part of the
/// given framework.
@@ -1042,6 +1046,11 @@ void ApplyHeaderSearchOptions(HeaderSearch &HS,
void normalizeModuleCachePath(FileManager &FileMgr, StringRef Path,
SmallVectorImpl<char> &NormalizedPath);
+std::string createSpecificModuleCachePath(FileManager &FileMgr,
+ StringRef ModuleCachePath,
+ bool DisableModuleHash,
+ std::string ContextHash);
+
} // namespace clang
#endif // LLVM_CLANG_LEX_HEADERSEARCH_H
diff --git a/clang/include/clang/Serialization/ASTBitCodes.h b/clang/include/clang/Serialization/ASTBitCodes.h
index 752e7fd288aa6..4e8fe1d32d42e 100644
--- a/clang/include/clang/Serialization/ASTBitCodes.h
+++ b/clang/include/clang/Serialization/ASTBitCodes.h
@@ -44,7 +44,7 @@ namespace serialization {
/// Version 4 of AST files also requires that the version control branch and
/// revision match exactly, since there is no backward compatibility of
/// AST files at this time.
-const unsigned VERSION_MAJOR = 35;
+const unsigned VERSION_MAJOR = 37;
/// AST file minor version number supported by this version of
/// Clang.
diff --git a/clang/include/clang/Serialization/ASTReader.h b/clang/include/clang/Serialization/ASTReader.h
index f254459ce933d..e9706d0ea2f2b 100644
--- a/clang/include/clang/Serialization/ASTReader.h
+++ b/clang/include/clang/Serialization/ASTReader.h
@@ -185,8 +185,7 @@ class ASTReaderListener {
/// otherwise.
virtual bool ReadHeaderSearchOptions(const HeaderSearchOptions &HSOpts,
StringRef ModuleFilename,
- StringRef SpecificModuleCachePath,
- bool Complain) {
+ StringRef ContextHash, bool Complain) {
return false;
}
@@ -304,8 +303,7 @@ class ChainedASTReaderListener : public ASTReaderListener {
bool Complain) override;
bool ReadHeaderSearchOptions(const HeaderSearchOptions &HSOpts,
- StringRef ModuleFilename,
- StringRef SpecificModuleCachePath,
+ StringRef ModuleFilename, StringRef ContextHash,
bool Complain) override;
bool ReadPreprocessorOptions(const PreprocessorOptions &PPOpts,
StringRef ModuleFilename, bool ReadMacros,
@@ -349,8 +347,7 @@ class PCHValidator : public ASTReaderListener {
bool Complain,
std::string &SuggestedPredefines) override;
bool ReadHeaderSearchOptions(const HeaderSearchOptions &HSOpts,
- StringRef ModuleFilename,
- StringRef SpecificModuleCachePath,
+ StringRef ModuleFilename, StringRef ContextHash,
bool Complain) override;
void ReadCounter(const serialization::ModuleFile &M, uint32_t Value) override;
};
@@ -1552,7 +1549,7 @@ class ASTReader
: Mod(Mod), ImportedBy(ImportedBy), ImportLoc(ImportLoc) {}
};
- ASTReadResult ReadASTCore(StringRef FileName, ModuleKind Type,
+ ASTReadResult ReadASTCore(ModuleFileName FileName, ModuleKind Type,
SourceLocation ImportLoc, ModuleFile *ImportedBy,
SmallVectorImpl<ImportedModule> &Loaded,
off_t ExpectedSize, time_t ExpectedModTime,
@@ -1888,7 +1885,7 @@ class ASTReader
/// NewLoadedModuleFile would refer to the address of the new loaded top level
/// module. The state of NewLoadedModuleFile is unspecified if the AST file
/// isn't loaded successfully.
- ASTReadResult ReadAST(StringRef FileName, ModuleKind Type,
+ ASTReadResult ReadAST(ModuleFileName FileName, ModuleKind Type,
SourceLocation ImportLoc,
unsigned ClientLoadCapabilities,
ModuleFile **NewLoadedModuleFile = nullptr);
diff --git a/clang/include/clang/Serialization/ModuleFile.h b/clang/include/clang/Serialization/ModuleFile.h
index 519a74d920129..4915ad0634fcd 100644
--- a/clang/include/clang/Serialization/ModuleFile.h
+++ b/clang/include/clang/Serialization/ModuleFile.h
@@ -159,6 +159,9 @@ class ModuleFile {
/// The file name of the module file.
std::string FileName;
+ /// All keys ModuleManager used for the module file.
+ llvm::DenseSet<ModuleFileKey> Keys;
+
/// The name of the module.
std::string ModuleName;
diff --git a/clang/include/clang/Serialization/ModuleManager.h b/clang/include/clang/Serialization/ModuleManager.h
index 8ab70b6630f47..162856f2f14c0 100644
--- a/clang/include/clang/Serialization/ModuleManager.h
+++ b/clang/include/clang/Serialization/ModuleManager.h
@@ -16,6 +16,7 @@
#include "clang/Basic/LLVM.h"
#include "clang/Basic/SourceLocation.h"
+#include "clang/Lex/HeaderSearch.h"
#include "clang/Serialization/ModuleFile.h"
#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/STLExtras.h"
@@ -56,8 +57,8 @@ class ModuleManager {
// to implement short-circuiting logic when running DFS over the dependencies.
SmallVector<ModuleFile *, 2> Roots;
- /// All loaded modules, indexed by name.
- llvm::DenseMap<const FileEntry *, ModuleFile *> Modules;
+ /// All loaded modules.
+ llvm::DenseMap<ModuleFileKey, ModuleFile *> Modules;
/// FileManager that handles translating between filenames and
/// FileEntry *.
@@ -172,14 +173,22 @@ class ModuleManager {
ModuleFile &operator[](unsigned Index) const { return *Chain[Index]; }
/// Returns the module associated with the given file name.
+ /// Soon to be removed.
ModuleFile *lookupByFileName(StringRef FileName) const;
/// Returns the module associated with the given module name.
ModuleFile *lookupByModuleName(StringRef ModName) const;
/// Returns the module associated with the given module file.
+ /// Soon to be removed.
ModuleFile *lookup(const FileEntry *File) const;
+ /// Returns the module associated with the given module file name.
+ ModuleFile *lookupByFileName(ModuleFileName FileName) const;
+
+ /// Returns the module associated with the given module file key.
+ ModuleFile *lookup(ModuleFileKey Key) const;
+
/// Returns the in-memory (virtual file) buffer with the given name
std::unique_ptr<llvm::MemoryBuffer> lookupBuffer(StringRef Name);
@@ -237,14 +246,13 @@ class ModuleManager {
///
/// \return A pointer to the module that corresponds to this file name,
/// and a value indicating whether the module was loaded.
- AddModuleResult addModule(StringRef FileName, ModuleKind Type,
- SourceLocation ImportLoc,
- ModuleFile *ImportedBy, unsigned Generation,
- off_t ExpectedSize, time_t ExpectedModTime,
+ AddModuleResult addModule(ModuleFileName FileName, ModuleKind Type,
+ SourceLocation ImportLoc, ModuleFile *ImportedBy,
+ unsigned Generation, off_t ExpectedSize,
+ time_t ExpectedModTime,
ASTFileSignature ExpectedSignature,
ASTFileSignatureReader ReadSignature,
- ModuleFile *&Module,
- std::string &ErrorStr);
+ ModuleFile *&Module, std::string &ErrorStr);
/// Remove the modules starting from First (to the end).
void removeModules(ModuleIterator First);
@@ -282,26 +290,6 @@ class ModuleManager {
void visit(llvm::function_ref<bool(ModuleFile &M)> Visitor,
llvm::SmallPtrSetImpl<ModuleFile *> *ModuleFilesHit = nullptr);
- /// Attempt to resolve the given module file name to a file entry.
- ///
- /// \param FileName The name of the module file.
- ///
- /// \param ExpectedSize The size that the module file is expected to have.
- /// If the actual size differs, the resolver should return \c true.
- ///
- /// \param ExpectedModTime The modification time that the module file is
- /// expected to have. If the actual modification time differs, the resolver
- /// should return \c true.
- ///
- /// \param File Will be set to the file if there is one, or null
- /// otherwise.
- ///
- /// \returns True if a file exists but does not meet the size/
- /// modification time criteria, false if the file is either available and
- /// suitable, or is missing.
- bool lookupModuleFile(StringRef FileName, off_t ExpectedSize,
- time_t ExpectedModTime, OptionalFileEntryRef &File);
-
/// View the graphviz representation of the module graph.
void viewGraph();
diff --git a/clang/lib/Basic/Module.cpp b/clang/lib/Basic/Module.cpp
index 97f742d292224..bfc29e32c3672 100644
--- a/clang/lib/Basic/Module.cpp
+++ b/clang/lib/Basic/Module.cpp
@@ -33,6 +33,23 @@
using namespace clang;
+s...
[truncated]
|
|
@llvm/pr-subscribers-clang-modules Author: Jan Svoboda (jansvoboda11) ChangesThis PR changes how Previously,
This PR identifies implicitly-built module files by a pair of Some tests needed to be updated because the module cache path directory was also used as an include directory. This PR relies on not caching the non-existence of the module cache directory in the Patch is 84.81 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/185765.diff 34 Files Affected:
diff --git a/clang-tools-extra/clangd/ModulesBuilder.cpp b/clang-tools-extra/clangd/ModulesBuilder.cpp
index 524ec620c4076..b5a0127d48b7e 100644
--- a/clang-tools-extra/clangd/ModulesBuilder.cpp
+++ b/clang-tools-extra/clangd/ModulesBuilder.cpp
@@ -275,8 +275,8 @@ bool IsModuleFileUpToDate(PathRef ModuleFilePath,
// listener.
Reader.setListener(nullptr);
- if (Reader.ReadAST(ModuleFilePath, serialization::MK_MainFile,
- SourceLocation(),
+ if (Reader.ReadAST(ModuleFileName::make_explicit(ModuleFilePath),
+ serialization::MK_MainFile, SourceLocation(),
ASTReader::ARR_None) != ASTReader::Success)
return false;
diff --git a/clang/include/clang/Basic/Module.h b/clang/include/clang/Basic/Module.h
index 69a1de6f79b35..c512249e5e945 100644
--- a/clang/include/clang/Basic/Module.h
+++ b/clang/include/clang/Basic/Module.h
@@ -54,6 +54,101 @@ class TargetInfo;
/// Describes the name of a module.
using ModuleId = SmallVector<std::pair<std::string, SourceLocation>, 2>;
+namespace serialization {
+class ModuleManager;
+} // namespace serialization
+
+/// Deduplication key for a loaded module file in \c ModuleManager.
+///
+/// For implicitly-built modules, this is the \c DirectoryEntry of the module
+/// cache and the module file name with the (optional) context hash.
+/// This enables using \c FileManager's inode-based canonicalization of the
+/// user-provided module cache path without hitting issues on file systems that
+/// recycle inodes for recompiled module files.
+///
+/// For explicitly-built modules, this is \c FileEntry.
+/// This uses \c FileManager's inode-based canonicalization of the user-provided
+/// module file path. Because input explicitly-built modules do not change
+/// during the lifetime of the compiler, inode recycling is not of concern here.
+class ModuleFileKey {
+ /// The FileManager entity used for deduplication.
+ const void *Ptr;
+ /// The path relative to the module cache path for implicit module file, empty
+ /// for other kinds of module files.
+ std::string PathSuffix;
+
+ friend class serialization::ModuleManager;
+ friend class ModuleFileName;
+ friend llvm::DenseMapInfo<ModuleFileKey>;
+
+ ModuleFileKey(const void *Ptr) : Ptr(Ptr) {}
+
+ ModuleFileKey(const FileEntry *ModuleFile) : Ptr(ModuleFile) {}
+
+ ModuleFileKey(const DirectoryEntry *ModuleCacheDir, StringRef PathSuffix)
+ : Ptr(ModuleCacheDir), PathSuffix(PathSuffix) {}
+
+public:
+ bool operator==(const ModuleFileKey &Other) const {
+ return Ptr == Other.Ptr && PathSuffix == Other.PathSuffix;
+ }
+
+ bool operator!=(const ModuleFileKey &Other) const {
+ return !operator==(Other);
+ }
+};
+
+/// Identifies a module file to be loaded.
+///
+/// For implicitly-built module files, the path is split into the module cache
+/// path and the module file name with the (optional) context hash. For all
+/// other types of module files, this is just the file system path.
+class ModuleFileName {
+ std::string Path;
+ std::optional<uint32_t> Separator;
+
+public:
+ /// Creates an empty module file name.
+ ModuleFileName() = default;
+
+ /// Creates a file name for an explicit module.
+ static ModuleFileName make_explicit(std::string Name) {
+ ModuleFileName File;
+ File.Path = std::move(Name);
+ return File;
+ }
+
+ /// Creates a file name for an explicit module.
+ static ModuleFileName make_explicit(StringRef Name) {
+ return make_explicit(Name.str());
+ }
+
+ /// Creates a file name for an implicit module.
+ static ModuleFileName make_implicit(std::string Name, uint32_t Separator) {
+ ModuleFileName File;
+ File.Path = std::move(Name);
+ File.Separator = Separator;
+ return File;
+ }
+
+ /// Creates a file name for an implicit module.
+ static ModuleFileName make_implicit(StringRef Name, uint32_t Separator) {
+ return make_implicit(Name.str(), Separator);
+ }
+
+ /// Returns the plain module file name.
+ StringRef str() const { return Path; }
+
+ /// Converts to StringRef representing the plain module file name.
+ operator StringRef() const { return Path; }
+
+ /// Checks whether the module file name is empty.
+ bool empty() const { return Path.empty(); }
+
+ /// Creates the deduplication key for use in \c ModuleManager.
+ std::optional<ModuleFileKey> makeKey(FileManager &FileMgr) const;
+};
+
/// The signature of a module, which is a hash of the AST content.
struct ASTFileSignature : std::array<uint8_t, 20> {
using BaseT = std::array<uint8_t, 20>;
@@ -926,4 +1021,23 @@ class VisibleModuleSet {
} // namespace clang
+template <> struct llvm::DenseMapInfo<clang::ModuleFileKey> {
+ static clang::ModuleFileKey getEmptyKey() {
+ return DenseMapInfo<const void *>::getEmptyKey();
+ }
+
+ static clang::ModuleFileKey getTombstoneKey() {
+ return DenseMapInfo<const void *>::getTombstoneKey();
+ }
+
+ static unsigned getHashValue(const clang::ModuleFileKey &Val) {
+ return hash_combine(Val.Ptr, Val.PathSuffix);
+ }
+
+ static bool isEqual(const clang::ModuleFileKey &LHS,
+ const clang::ModuleFileKey &RHS) {
+ return LHS == RHS;
+ }
+};
+
#endif // LLVM_CLANG_BASIC_MODULE_H
diff --git a/clang/include/clang/Frontend/CompilerInstance.h b/clang/include/clang/Frontend/CompilerInstance.h
index 266e0826b38f4..0d684d5c7f9fe 100644
--- a/clang/include/clang/Frontend/CompilerInstance.h
+++ b/clang/include/clang/Frontend/CompilerInstance.h
@@ -17,6 +17,7 @@
#include "clang/Frontend/PCHContainerOperations.h"
#include "clang/Frontend/Utils.h"
#include "clang/Lex/DependencyDirectivesScanner.h"
+#include "clang/Lex/HeaderSearch.h"
#include "clang/Lex/HeaderSearchOptions.h"
#include "clang/Lex/ModuleLoader.h"
#include "llvm/ADT/ArrayRef.h"
@@ -746,11 +747,6 @@ class CompilerInstance : public ModuleLoader {
GetDependencyDirectives = std::move(Getter);
}
- std::string getSpecificModuleCachePath(StringRef ContextHash);
- std::string getSpecificModuleCachePath() {
- return getSpecificModuleCachePath(getInvocation().computeContextHash());
- }
-
/// Create the AST context.
void createASTContext();
@@ -872,7 +868,7 @@ class CompilerInstance : public ModuleLoader {
void createASTReader();
- bool loadModuleFile(StringRef FileName,
+ bool loadModuleFile(ModuleFileName FileName,
serialization::ModuleFile *&LoadedModuleFile);
/// Configuration object for making the result of \c cloneForModuleCompile()
diff --git a/clang/include/clang/Lex/HeaderSearch.h b/clang/include/clang/Lex/HeaderSearch.h
index d7000da682c6e..fd52a0ad5b007 100644
--- a/clang/include/clang/Lex/HeaderSearch.h
+++ b/clang/include/clang/Lex/HeaderSearch.h
@@ -282,6 +282,10 @@ class HeaderSearch {
/// The specific module cache path containing ContextHash (unless suppressed).
std::string SpecificModuleCachePath;
+ /// The length of the normalized module cache path at the start of \c
+ /// SpecificModuleCachePath.
+ size_t NormalizedModuleCachePathLen = 0;
+
/// All of the preprocessor-specific data about files that are
/// included, indexed by the FileEntry's UID.
mutable std::vector<HeaderFileInfo> FileInfo;
@@ -467,20 +471,20 @@ class HeaderSearch {
return {};
}
- /// Set the context hash to use for module cache paths.
- void setContextHash(StringRef Hash) { ContextHash = std::string(Hash); }
+ /// Initialize the module cache path.
+ void initializeModuleCachePath(std::string ContextHash);
- /// Set the module cache path with the context hash (unless suppressed).
- void setSpecificModuleCachePath(StringRef Path) {
- SpecificModuleCachePath = std::string(Path);
+ /// Retrieve the module cache path with the context hash (unless suppressed).
+ StringRef getSpecificModuleCachePath() const {
+ return SpecificModuleCachePath;
}
/// Retrieve the context hash.
StringRef getContextHash() const { return ContextHash; }
- /// Retrieve the module cache path with the context hash (unless suppressed).
- StringRef getSpecificModuleCachePath() const {
- return SpecificModuleCachePath;
+ /// Retrieve the normalized module cache path.
+ StringRef getNormalizedModuleCachePath() const {
+ return getSpecificModuleCachePath().substr(0, NormalizedModuleCachePathLen);
}
/// Forget everything we know about headers so far.
@@ -657,7 +661,7 @@ class HeaderSearch {
///
/// \returns The name of the module file that corresponds to this module,
/// or an empty string if this module does not correspond to any module file.
- std::string getCachedModuleFileName(Module *Module);
+ ModuleFileName getCachedModuleFileName(Module *Module);
/// Retrieve the name of the prebuilt module file that should be used
/// to load a module with the given name.
@@ -669,8 +673,8 @@ class HeaderSearch {
///
/// \returns The name of the module file that corresponds to this module,
/// or an empty string if this module does not correspond to any module file.
- std::string getPrebuiltModuleFileName(StringRef ModuleName,
- bool FileMapOnly = false);
+ ModuleFileName getPrebuiltModuleFileName(StringRef ModuleName,
+ bool FileMapOnly = false);
/// Retrieve the name of the prebuilt module file that should be used
/// to load the given module.
@@ -679,7 +683,7 @@ class HeaderSearch {
///
/// \returns The name of the module file that corresponds to this module,
/// or an empty string if this module does not correspond to any module file.
- std::string getPrebuiltImplicitModuleFileName(Module *Module);
+ ModuleFileName getPrebuiltImplicitModuleFileName(Module *Module);
/// Retrieve the name of the (to-be-)cached module file that should
/// be used to load a module with the given name.
@@ -691,8 +695,8 @@ class HeaderSearch {
///
/// \returns The name of the module file that corresponds to this module,
/// or an empty string if this module does not correspond to any module file.
- std::string getCachedModuleFileName(StringRef ModuleName,
- StringRef ModuleMapPath);
+ ModuleFileName getCachedModuleFileName(StringRef ModuleName,
+ StringRef ModuleMapPath);
/// Lookup a module Search for a module with the given name.
///
@@ -808,13 +812,13 @@ class HeaderSearch {
/// \param ModuleMapPath A path that when combined with \c ModuleName
/// uniquely identifies this module. See Module::ModuleMap.
///
- /// \param CachePath A path to the module cache.
+ /// \param NormalizedCachePath The normalized path to the module cache.
///
/// \returns The name of the module file that corresponds to this module,
/// or an empty string if this module does not correspond to any module file.
- std::string getCachedModuleFileNameImpl(StringRef ModuleName,
- StringRef ModuleMapPath,
- StringRef CachePath);
+ ModuleFileName getCachedModuleFileNameImpl(StringRef ModuleName,
+ StringRef ModuleMapPath,
+ StringRef NormalizedCachePath);
/// Retrieve a module with the given name, which may be part of the
/// given framework.
@@ -1042,6 +1046,11 @@ void ApplyHeaderSearchOptions(HeaderSearch &HS,
void normalizeModuleCachePath(FileManager &FileMgr, StringRef Path,
SmallVectorImpl<char> &NormalizedPath);
+std::string createSpecificModuleCachePath(FileManager &FileMgr,
+ StringRef ModuleCachePath,
+ bool DisableModuleHash,
+ std::string ContextHash);
+
} // namespace clang
#endif // LLVM_CLANG_LEX_HEADERSEARCH_H
diff --git a/clang/include/clang/Serialization/ASTBitCodes.h b/clang/include/clang/Serialization/ASTBitCodes.h
index 752e7fd288aa6..4e8fe1d32d42e 100644
--- a/clang/include/clang/Serialization/ASTBitCodes.h
+++ b/clang/include/clang/Serialization/ASTBitCodes.h
@@ -44,7 +44,7 @@ namespace serialization {
/// Version 4 of AST files also requires that the version control branch and
/// revision match exactly, since there is no backward compatibility of
/// AST files at this time.
-const unsigned VERSION_MAJOR = 35;
+const unsigned VERSION_MAJOR = 37;
/// AST file minor version number supported by this version of
/// Clang.
diff --git a/clang/include/clang/Serialization/ASTReader.h b/clang/include/clang/Serialization/ASTReader.h
index f254459ce933d..e9706d0ea2f2b 100644
--- a/clang/include/clang/Serialization/ASTReader.h
+++ b/clang/include/clang/Serialization/ASTReader.h
@@ -185,8 +185,7 @@ class ASTReaderListener {
/// otherwise.
virtual bool ReadHeaderSearchOptions(const HeaderSearchOptions &HSOpts,
StringRef ModuleFilename,
- StringRef SpecificModuleCachePath,
- bool Complain) {
+ StringRef ContextHash, bool Complain) {
return false;
}
@@ -304,8 +303,7 @@ class ChainedASTReaderListener : public ASTReaderListener {
bool Complain) override;
bool ReadHeaderSearchOptions(const HeaderSearchOptions &HSOpts,
- StringRef ModuleFilename,
- StringRef SpecificModuleCachePath,
+ StringRef ModuleFilename, StringRef ContextHash,
bool Complain) override;
bool ReadPreprocessorOptions(const PreprocessorOptions &PPOpts,
StringRef ModuleFilename, bool ReadMacros,
@@ -349,8 +347,7 @@ class PCHValidator : public ASTReaderListener {
bool Complain,
std::string &SuggestedPredefines) override;
bool ReadHeaderSearchOptions(const HeaderSearchOptions &HSOpts,
- StringRef ModuleFilename,
- StringRef SpecificModuleCachePath,
+ StringRef ModuleFilename, StringRef ContextHash,
bool Complain) override;
void ReadCounter(const serialization::ModuleFile &M, uint32_t Value) override;
};
@@ -1552,7 +1549,7 @@ class ASTReader
: Mod(Mod), ImportedBy(ImportedBy), ImportLoc(ImportLoc) {}
};
- ASTReadResult ReadASTCore(StringRef FileName, ModuleKind Type,
+ ASTReadResult ReadASTCore(ModuleFileName FileName, ModuleKind Type,
SourceLocation ImportLoc, ModuleFile *ImportedBy,
SmallVectorImpl<ImportedModule> &Loaded,
off_t ExpectedSize, time_t ExpectedModTime,
@@ -1888,7 +1885,7 @@ class ASTReader
/// NewLoadedModuleFile would refer to the address of the new loaded top level
/// module. The state of NewLoadedModuleFile is unspecified if the AST file
/// isn't loaded successfully.
- ASTReadResult ReadAST(StringRef FileName, ModuleKind Type,
+ ASTReadResult ReadAST(ModuleFileName FileName, ModuleKind Type,
SourceLocation ImportLoc,
unsigned ClientLoadCapabilities,
ModuleFile **NewLoadedModuleFile = nullptr);
diff --git a/clang/include/clang/Serialization/ModuleFile.h b/clang/include/clang/Serialization/ModuleFile.h
index 519a74d920129..4915ad0634fcd 100644
--- a/clang/include/clang/Serialization/ModuleFile.h
+++ b/clang/include/clang/Serialization/ModuleFile.h
@@ -159,6 +159,9 @@ class ModuleFile {
/// The file name of the module file.
std::string FileName;
+ /// All keys ModuleManager used for the module file.
+ llvm::DenseSet<ModuleFileKey> Keys;
+
/// The name of the module.
std::string ModuleName;
diff --git a/clang/include/clang/Serialization/ModuleManager.h b/clang/include/clang/Serialization/ModuleManager.h
index 8ab70b6630f47..162856f2f14c0 100644
--- a/clang/include/clang/Serialization/ModuleManager.h
+++ b/clang/include/clang/Serialization/ModuleManager.h
@@ -16,6 +16,7 @@
#include "clang/Basic/LLVM.h"
#include "clang/Basic/SourceLocation.h"
+#include "clang/Lex/HeaderSearch.h"
#include "clang/Serialization/ModuleFile.h"
#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/STLExtras.h"
@@ -56,8 +57,8 @@ class ModuleManager {
// to implement short-circuiting logic when running DFS over the dependencies.
SmallVector<ModuleFile *, 2> Roots;
- /// All loaded modules, indexed by name.
- llvm::DenseMap<const FileEntry *, ModuleFile *> Modules;
+ /// All loaded modules.
+ llvm::DenseMap<ModuleFileKey, ModuleFile *> Modules;
/// FileManager that handles translating between filenames and
/// FileEntry *.
@@ -172,14 +173,22 @@ class ModuleManager {
ModuleFile &operator[](unsigned Index) const { return *Chain[Index]; }
/// Returns the module associated with the given file name.
+ /// Soon to be removed.
ModuleFile *lookupByFileName(StringRef FileName) const;
/// Returns the module associated with the given module name.
ModuleFile *lookupByModuleName(StringRef ModName) const;
/// Returns the module associated with the given module file.
+ /// Soon to be removed.
ModuleFile *lookup(const FileEntry *File) const;
+ /// Returns the module associated with the given module file name.
+ ModuleFile *lookupByFileName(ModuleFileName FileName) const;
+
+ /// Returns the module associated with the given module file key.
+ ModuleFile *lookup(ModuleFileKey Key) const;
+
/// Returns the in-memory (virtual file) buffer with the given name
std::unique_ptr<llvm::MemoryBuffer> lookupBuffer(StringRef Name);
@@ -237,14 +246,13 @@ class ModuleManager {
///
/// \return A pointer to the module that corresponds to this file name,
/// and a value indicating whether the module was loaded.
- AddModuleResult addModule(StringRef FileName, ModuleKind Type,
- SourceLocation ImportLoc,
- ModuleFile *ImportedBy, unsigned Generation,
- off_t ExpectedSize, time_t ExpectedModTime,
+ AddModuleResult addModule(ModuleFileName FileName, ModuleKind Type,
+ SourceLocation ImportLoc, ModuleFile *ImportedBy,
+ unsigned Generation, off_t ExpectedSize,
+ time_t ExpectedModTime,
ASTFileSignature ExpectedSignature,
ASTFileSignatureReader ReadSignature,
- ModuleFile *&Module,
- std::string &ErrorStr);
+ ModuleFile *&Module, std::string &ErrorStr);
/// Remove the modules starting from First (to the end).
void removeModules(ModuleIterator First);
@@ -282,26 +290,6 @@ class ModuleManager {
void visit(llvm::function_ref<bool(ModuleFile &M)> Visitor,
llvm::SmallPtrSetImpl<ModuleFile *> *ModuleFilesHit = nullptr);
- /// Attempt to resolve the given module file name to a file entry.
- ///
- /// \param FileName The name of the module file.
- ///
- /// \param ExpectedSize The size that the module file is expected to have.
- /// If the actual size differs, the resolver should return \c true.
- ///
- /// \param ExpectedModTime The modification time that the module file is
- /// expected to have. If the actual modification time differs, the resolver
- /// should return \c true.
- ///
- /// \param File Will be set to the file if there is one, or null
- /// otherwise.
- ///
- /// \returns True if a file exists but does not meet the size/
- /// modification time criteria, false if the file is either available and
- /// suitable, or is missing.
- bool lookupModuleFile(StringRef FileName, off_t ExpectedSize,
- time_t ExpectedModTime, OptionalFileEntryRef &File);
-
/// View the graphviz representation of the module graph.
void viewGraph();
diff --git a/clang/lib/Basic/Module.cpp b/clang/lib/Basic/Module.cpp
index 97f742d292224..bfc29e32c3672 100644
--- a/clang/lib/Basic/Module.cpp
+++ b/clang/lib/Basic/Module.cpp
@@ -33,6 +33,23 @@
using namespace clang;
+s...
[truncated]
|
e558b32 to
2a23310
Compare
Bigcheese
left a comment
There was a problem hiding this comment.
Not sure about this todo, but the change otherwise looks good.
| if (ModuleEntry) | ||
| ModuleEntry->Keys.insert(*FileKey); | ||
|
|
||
| // TODO: Remove this. |
There was a problem hiding this comment.
At the time I thought there are multiple places that may look for the same file with inconsistent explicit/implicit path, so this seemed necessary to temporarily support both. But now that I look back, I might've been observing the fact that some tests used %t for both an include path and the module cache, so the DirectoryEntry got cached as non-existing. This made ModuleManager lookups fail with the implicit path, and turning it explicit and just looking up the full path in FileManager would succeed. I've removed this in 86ee991 and also removed the storage for multiple keys on ModuleFile in a109526, since that was there only to support this hack.
LMK if you're happy with how it looks now.
|
I made couple of changes in 7b6c696..2323be8:
Can you PTAL? |
🪟 Windows x64 Test Results
✅ The build succeeded and all tests passed. |
| // expected-remark{{importing module 'a' into 'b'}} | ||
|
|
||
| // Module cache via symlink. | ||
| // RUN: ln -s %t/cache %t/cache-symlink |
There was a problem hiding this comment.
I assume it's this bit that's making it fail on Windows. Maybe split this part out and add a requires to skip it on Windows? I don't think we have a good way to make symlinks on Windows yet.
There was a problem hiding this comment.
Actually this command succeeds, but I'm not sure if it ends up creating a hardlink.
Bigcheese
left a comment
There was a problem hiding this comment.
Looks good except the Windows issue. I'm not sure if that's a real issue with the code, or just with ln -s on Windows.
|
Yeah, that's just |
This PR changes how
ModuleManagerdeduplicates module files.Previously,
ModuleManagerusedFileEntryfor assigning unique identity to module files. This works fine for explicitly-built modules because they don't change during the lifetime of a single Clang instance. For implicitly-built modules however, there are two issues:FileEntryobjects are deduplicated byFileManagerbased on the inode number. Some file systems reuse inode numbers of previously removed files. Because implicitly-built module files are rapidly removed and created, this deduplication breaks and compilations may fail spuriously when inode numbers are recycled during the lifetime of a single Clang instance.ModuleManagerdoes when loading a module file is consulting theFileManagerand checking the file size and modification time match the expectation of the importer. This is done even when such module file already lives in theInMemoryModuleCache. This introduces racy behavior into the mechanism that explicitly tries to solve race conditions, and may lead into spurious compilation failures.This PR identifies implicitly-built module files by a pair of
DirectoryEntryof the module cache path and the path suffix<context-hash>/<module-name>-<module-map-path-hash>.pcm. This gives us canonicalization of the user-provided module cache path without turning toFileEntryfor the PCM file. The path suffix is Clang-generated and is already canonical.Some tests needed to be updated because the module cache path directory was also used as an include directory. This PR relies on not caching the non-existence of the module cache directory in the
FileManager. When other parts of Clang are trying to look up the same path and cache its non-existence, things break. This is probably very specific to some of our tests and not how users are setting up their compilations.